Some Two-Step Procedures for Variable Selection in High-Dimensional Linear Regression

نویسندگان

  • Jian Zhang
  • Xinge Jessie Jeng
  • Han Liu
چکیده

We study the problem of high-dimensional variable selection via some two-step procedures. First we show that given some good initial estimator which is l∞-consistent but not necessarily variable selection consistent, we can apply the nonnegative Garrote, adaptive Lasso or hard-thresholding procedure to obtain a final estimator that is both estimation and variable selection consistent. Unlike the Lasso, our results do not require the irrepresentable condition which could fail easily even for moderate pn (Zhao and Yu, 2007) and it also allows pn to grow almost as fast as exp(n) (for hardthresholding there is no restriction on pn). We also study the conditions under which the Ridge regression can be used as an initial estimator. We show that under a relaxed identifiable condition, the Ridge estimator is l∞-consistent. Such a condition is usually satisfied when pn ≤ n and does not require the partial orthogonality between relevant and irrelevant covariates which is needed for the univariate regression in (Huang et al., 2008). Our numerical studies show that when using the Lasso or Ridge as initial estimator, the two-step procedures have a higher sparsity recovery rate than the Lasso or adaptive Lasso with univariate regression used in (Huang et al., 2008).

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Variable Selection in Single Index Quantile Regression for Heteroscedastic Data

Quantile regression (QR) has become a popular method of data analysis, especially when the error term is heteroscedastic, due to its relevance in many scientific studies. The ubiquity of high dimensional data has led to a number of variable selection methods for linear/nonlinear QR models and, recently, for the single index quantile regression (SIQR) model. We propose a new algorithm for simult...

متن کامل

Variable selection in linear models

Variable selection in linear models is essential for improved inference and interpretation, an activity which has become even more critical for high dimensional data. In this article, we provide a selective review of some classical methods including Akaike information criterion, Bayesian information criterion, Mallow’s Cp and risk inflation criterion, as well as regularization methods including...

متن کامل

A Note on the Lasso and Related Procedures in Model Selection

The Lasso, the Forward Stagewise regression and the Lars are closely related procedures recently proposed for linear regression problems. Each of them can produce sparse models and can be used both for estimation and variable selection. In practical implementations these algorithms are typically tuned to achieve optimal prediction accuracy. We show that, when the prediction accuracy is used as ...

متن کامل

A Stepwise Regression Method and Consistent Model Selection for High-dimensional Sparse Linear Models by Ching-kang Ing

We introduce a fast stepwise regression method, called the orthogonal greedy algorithm (OGA), that selects input variables to enter a p-dimensional linear regression model (with p >> n, the sample size) sequentially so that the selected variable at each step minimizes the residual sum squares. We derive the convergence rate of OGA as m = mn becomes infinite, and also develop a consistent model ...

متن کامل

Hyperspectral Image Classification Based on the Fusion of the Features Generated by Sparse Representation Methods, Linear and Non-linear Transformations

The ability of recording the high resolution spectral signature of earth surface would be the most important feature of hyperspectral sensors. On the other hand, classification of hyperspectral imagery is known as one of the methods to extracting information from these remote sensing data sources. Despite the high potential of hyperspectral images in the information content point of view, there...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008